Quantization in Append-Only Collections
Authors
Abstract
Quantization, the pre-calculation and conversion to integers of term/document weights in an inverted index, is a well-studied aspect of search engines that substantially improves retrieval efficiency. Previous work has considered the impact of quantization on effectiveness–efficiency tradeoffs in retrieval, for example, exploring the relationship between collection size and quantization range in static web collections. We extend previous work to append-only collections and examine whether quantization settings derived from prior time periods can be applied to future time periods. Experiments confirm that previous results generalize to a collection with different characteristics and with a different ranking function, and that in an append-only collection, we can use previous quantization settings in future time periods without substantial losses in either effectiveness or efficiency.
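As a rough illustration of the idea in the abstract, the sketch below fits a quantization range on weights from an earlier time period and reuses it to quantize weights that arrive later. It assumes simple uniform (linear) quantization with clamping, which is not necessarily the scheme used in the paper; the function names `fit_range` and `quantize` and the sample weights are hypothetical.

```python
def fit_range(weights):
    """Derive quantization settings (lower and upper bound) from observed weights."""
    return min(weights), max(weights)


def quantize(weight, lo, hi, bits=8):
    """Map a real-valued weight to an integer in [0, 2**bits - 1].

    Assumption: uniform linear quantization. Weights outside [lo, hi],
    e.g. from documents appended after the range was fitted, are clamped.
    """
    levels = (1 << bits) - 1
    if hi <= lo:
        return 0  # degenerate range: every weight maps to the same bucket
    clamped = min(max(weight, lo), hi)
    return int((clamped - lo) / (hi - lo) * levels)


# Settings derived from an earlier time period (made-up weights).
lo, hi = fit_range([0.5, 2.0, 3.5])

# Reuse the same settings for weights that arrive in a later period.
later_weights = [0.0, 1.25, 5.0]
impacts = [quantize(w, lo, hi) for w in later_weights]
```

Clamping is one simple way to handle later weights that fall outside the fitted range; whether that loses effectiveness in practice is the kind of question the abstract says the experiments address.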
Related resources
Single-Pass Algorithms for Mining Frequency Change Patterns with Limited Space in Evolving Append-Only and Dynamic Transaction Data Streams
In this paper, we propose an online single-pass algorithm, MFC-append (Mining Frequency Change patterns in append-only data streams), for online mining of frequent frequency-change items in continuous append-only data streams. An online space-efficient data structure called ChangeSketch is developed to provide fast response times when computing dynamic frequency changes between data streams. A modifie...
Online Mining Changes of Items over Continuous Append-only and Dynamic Data Streams
Online mining changes over data streams has been recognized to be an important task in data mining. Mining changes over data streams is both compelling and challenging. In this paper, we propose a new, single-pass algorithm, called MFC-append (Mining Frequency Changes of append-only data streams), for discovering the frequent frequency-changed items, vibrated frequency changed items, and stable...
Fast and Secure Append-Only Storage with Infinite Capacity
Computer forensic analysis, intrusion detection and disaster recovery are all dependent on the existence of trustworthy log files. Current storage systems for such log files are generally prone to modification attacks, especially by an intruder who wishes to wipe out the trail he leaves during a successful break-in. In light of recent advances in storage capacity and sharp drop in prices of sto...
Hardware-Assisted Intrusion Detection by Preserving Reference Information Integrity
Malware detectors and integrity checkers detect malicious activities by comparing against reference data. To ensure their trustworthy operation, it is crucial to protect the reference data from unauthorized modification. This paper proposes the Soteria Security Card (SSC), an append-only storage. To the best of our knowledge, this work is the first to introduce the concept of an append-only sto...
The Append-Only Web Bulletin Board
A large number of papers on verifiable electronic voting that have appeared in the literature in recent years have relied heavily on the availability of an append-only web bulletin board. Despite this widespread requirement, however, the notion of an append-only web bulletin board remains somewhat vague, and no method of constructing such a bulletin board has been proposed. This paper fills the...